With regard to the language, this report is based on:
[x] the international application in the language in which it was filed

[ ] a translation of the international application into
[ ] international search (under Rules 12.3(a) and 23.1(b))

[ ] publication of the international application (under Rule 12.4(a))

[ ] international preliminary examination (Rules 55.2(a) and/or 55.3(a))

With regard to the elements of the international application, this report is based on (replacement sheets which have been
furnished to the receiving Office in response to an invitation under Article 14 are referred to in this report as "originally
filed" and are not annexed to this report):
[ ] the international application as originally filed/furnished

[x] the description:

pages 1, 6-16 as originally filed/furnished

pages* 2-5 received by this Authority on 6 September 2005 with the letter of the same date
pages* received by this Authority on with the letter of

[x] the claims: 

pages as originally filed/furnished 

pages* as amended (together with any statement) under Article 19 

pages* 17-20 received by this Authority on 6 February 2006 with the letter of the same date 
pages* received by this Authority on with the letter of

[x] the drawings:

pages 1/5-5/5 as originally filed/furnished

pages* received by this Authority on with the letter of

pages* received by this Authority on with the letter of

[ ] a sequence listing and/or any related table(s) - see Supplemental Box Relating to Sequence Listing.

3. The amendments have resulted in the cancellation of: 

[ ] the description, pages

[ ] the claims, Nos.

[ ] the drawings, sheets/figs

[ ] the sequence listing (specify):

[ ] any table(s) related to the sequence listing (specify):

4. [ ] This report has been established as if (some of) the amendments annexed to this report and listed below had not been
made, since they have been considered to go beyond the disclosure as filed, as indicated in the Supplemental Box (Rule
[ ] the description, pages

[ ] the claims, Nos.

[ ] the drawings, sheets/figs

[ ] the sequence listing (specify):

[ ] any table(s) related to the sequence listing (specify):
2. Citations and explanations (Rule 70.7)

CLAIMS 1-17

None of the citations in the search report, individually or in combination, disclose the features of the claims. 
Furthermore, none of the distinguishing features over prior art would either be obvious to a person skilled in the 
art or would merely amount to adding common general knowledge. The claims are, therefore, novel and 
inventive. 
SUMMARY

In accordance with a first aspect of the present invention there is
provided a method of encoding a document image, the method comprising extracting
and encoding one or more picture areas from the document image; extracting and
encoding one or more character areas from the document image; obtaining a
background image by subtracting the image and character areas from the document
image; encoding the background image; and generating the encoded document image
from the encoded picture areas, the encoded character areas, and the encoded
background image.
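
The layer separation described in the first aspect can be illustrated with a minimal sketch. It assumes that the picture and character areas are already available as boolean masks and that "subtracting" means replacing the extracted pixels with a fill value; the names `split_layers`, `picture_mask`, `character_mask` and `fill` are hypothetical and do not come from the specification. The encoding of each layer is left out.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale pixel values, 0..255
Mask = List[List[bool]]   # True where a pixel belongs to an extracted area


def split_layers(doc: Image, picture_mask: Mask, character_mask: Mask,
                 fill: int = 255) -> Tuple[Image, Image, Image]:
    """Split a document image into picture, character and background layers.

    Assumption: subtraction is modelled as overwriting the extracted pixels
    in the background with `fill`.
    """
    h, w = len(doc), len(doc[0])
    pictures = [[fill] * w for _ in range(h)]
    characters = [[fill] * w for _ in range(h)]
    background = [row[:] for row in doc]  # start from a copy of the document
    for y in range(h):
        for x in range(w):
            if picture_mask[y][x]:
                pictures[y][x] = doc[y][x]
                background[y][x] = fill
            elif character_mask[y][x]:
                characters[y][x] = doc[y][x]
                background[y][x] = fill
    return pictures, characters, background


# Tiny example: one character pixel in the top row, a two-pixel picture below.
doc = [[200, 10, 250],
       [90, 95, 250]]
picture_mask = [[False, False, False],
                [True, True, False]]
character_mask = [[False, True, False],
                  [False, False, False]]
pictures, characters, background = split_layers(doc, picture_mask, character_mask)
```

Each of the three layers would then be passed to its own encoder, as the aspect describes.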

The character blocks of the character areas may be classified with reference to 
dynamically generated templates. 

The background image may be encoded utilising a SAQ compression algorithm. 

The SAQ compression algorithm may be wavelet-based. 

The extracting of the picture areas and/or the character areas may comprise
marking blocks partitioned from the document image based on features of wavelet
coefficients of the respective blocks.
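
As an illustration of marking blocks by wavelet-coefficient features, the sketch below computes one level of a 2-D Haar transform on a block and uses the mean energy of the detail coefficients as the feature. The choice of the Haar wavelet, the energy feature, the threshold value and the labels "detail" and "smooth" are all assumptions made for this example; the passage above does not specify them.

```python
def haar_details(block):
    """One level of a 2-D Haar transform on a block of at least 2x2 pixels.

    Returns the horizontal, vertical and diagonal difference coefficients
    as three flat lists (the averaged low-pass band is not needed here).
    """
    horiz, vert, diag = [], [], []
    for y in range(0, len(block) - 1, 2):
        for x in range(0, len(block[0]) - 1, 2):
            a, b = block[y][x], block[y][x + 1]
            c, d = block[y + 1][x], block[y + 1][x + 1]
            horiz.append((a - b + c - d) / 4.0)
            vert.append((a + b - c - d) / 4.0)
            diag.append((a - b - c + d) / 4.0)
    return horiz, vert, diag


def detail_energy(block):
    """Mean squared value of all detail coefficients of the block."""
    coeffs = [c for band in haar_details(block) for c in band]
    return sum(c * c for c in coeffs) / len(coeffs)


def mark_block(block, threshold=100.0):
    """Label a block by its detail energy (hypothetical threshold)."""
    return "detail" if detail_energy(block) > threshold else "smooth"


flat = [[128] * 4 for _ in range(4)]
checker = [[0, 255, 0, 255],
           [255, 0, 255, 0],
           [0, 255, 0, 255],
           [255, 0, 255, 0]]
labels = [mark_block(flat), mark_block(checker)]
```

A flat block has zero detail energy, while a block with sharp transitions, such as text strokes, has a high one, which is why such a feature can separate block types.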

The extracting of the picture areas may comprise a hierarchical extraction
comprising extracting picture blocks from the document image to generate one or more
initial picture areas and refining the initial picture areas by extracting picture pixels
adjacent to the initial picture areas.
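
The refinement step can be sketched as region growing: starting from the block-level picture mask, adjacent pixels are added while they satisfy a pixel-level test. The test itself is not given in this passage, so `is_picture_pixel` is a hypothetical callback, and the use of 4-neighbour adjacency is likewise an assumption.

```python
from collections import deque


def refine_area(mask, is_picture_pixel):
    """Grow an initial picture mask into adjacent pixels that pass the test.

    mask: 2-D list of booleans from the block-level extraction.
    is_picture_pixel: callable (y, x) -> bool, a hypothetical pixel test.
    """
    h, w = len(mask), len(mask[0])
    refined = [row[:] for row in mask]
    queue = deque((y, x) for y in range(h) for x in range(w) if mask[y][x])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not refined[ny][nx]
                    and is_picture_pixel(ny, nx)):
                refined[ny][nx] = True
                queue.append((ny, nx))
    return refined


# Example: treat mid-tone pixels as picture pixels (an arbitrary rule).
image = [[10, 120, 130, 250]]
initial = [[False, False, True, False]]
refined = refine_area(initial, lambda y, x: 50 <= image[y][x] <= 200)
```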

The extracting of the character areas from the document image may comprise
utilising a customised definition of the connectivity of the pixels.
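
One way to picture a customised connectivity is connected-component labelling in which the set of neighbour offsets is a parameter. The sketch below is an assumption about how such a definition could be expressed; the offsets shown (4-neighbour, 8-neighbour, and a variant that bridges a one-pixel vertical gap, as between the dot and stem of an "i") are illustrative and are not taken from the specification. The offsets should be symmetric so that connectivity is mutual.

```python
from collections import deque

FOUR = ((-1, 0), (1, 0), (0, -1), (0, 1))
EIGHT = FOUR + ((-1, -1), (-1, 1), (1, -1), (1, 1))
# Hypothetical custom rule: also join pixels across a one-pixel vertical gap.
BRIDGE_VERTICAL = FOUR + ((-2, 0), (2, 0))


def label_components(binary, offsets):
    """Label foreground pixels; two pixels are connected if one is reachable
    from the other through steps taken from `offsets`.

    Returns (labels, count), where labels holds 0 for background pixels.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


diagonal = [[1, 0],
            [0, 1]]
dotted = [[1],
          [0],
          [1]]
counts = (label_components(diagonal, FOUR)[1],
          label_components(diagonal, EIGHT)[1],
          label_components(dotted, FOUR)[1],
          label_components(dotted, BRIDGE_VERTICAL)[1])
```

Changing the offsets changes which pixel groups are treated as a single character block, without touching the labelling routine.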

The method may further comprise generating style data as a description of the 
templates and character blocks. 
The classifying of the character blocks may comprise a hierarchical matching
comprising matching the style of each character block based on the style data and then
matching each character block against selected ones of the templates based on the
style data matching.
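
The two-stage matching can be sketched as a coarse filter followed by a pixel comparison, with a new template created whenever no existing one is close enough. This is an illustration under stated assumptions: the style data is reduced to the block's height and width, and a plain Hamming distance stands in for the morphological matching named in the text. The names `classify_blocks` and `max_mismatch` are hypothetical.

```python
def hamming(a, b):
    """Number of differing pixels between two equally sized binary blocks."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def style_of(block):
    """Hypothetical style data: (height, width) of the block."""
    return (len(block), len(block[0]))


def classify_blocks(blocks, max_mismatch):
    """Assign each block a template index, generating templates dynamically.

    Stage 1 keeps only templates whose style equals the block's style.
    Stage 2 picks the closest remaining template by Hamming distance.
    """
    templates, indices = [], []
    for block in blocks:
        candidates = [i for i, t in enumerate(templates)
                      if style_of(t) == style_of(block)]
        best, best_dist = None, max_mismatch + 1
        for i in candidates:
            dist = hamming(block, templates[i])
            if dist < best_dist:
                best, best_dist = i, dist
        if best is None:            # nothing close enough: add a new template
            templates.append(block)
            best = len(templates) - 1
        indices.append(best)
    return templates, indices


blocks = [[[1, 0], [0, 1]],      # becomes template 0
          [[1, 0], [0, 0]],      # one pixel away from template 0
          [[1, 1, 1]]]           # different style, becomes template 1
templates, indices = classify_blocks(blocks, max_mismatch=1)
```

The style filter keeps the expensive pixel comparison to a small subset of the templates, which is the point of matching hierarchically.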

The classifying of the character blocks based on the templates may comprise 
morphological matching. 

The morphological matching may comprise matching algorithms M1 and M2.

Different structure elements may be utilised for different types of document
images.

The method may further comprise bit plane storage of a compressed stream of
the document image in the order of character areas, picture area and background image
for progressive decoding.
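
A minimal sketch of bit plane storage for progressive decoding follows. It assumes 8-bit values, most significant plane first, and a stream that simply lists the layers in the stated order; how the actual method interleaves planes and layers is not given in this passage, and the function names are hypothetical.

```python
def to_bit_planes(values, bits=8):
    """Split integer values into bit planes, most significant plane first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits - 1, -1, -1)]


def from_bit_planes(planes, bits=8):
    """Rebuild values from the leading planes; missing low bits stay zero."""
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = bits - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


def build_stream(characters, pictures, background, bits=8):
    """Order the layers as character areas, picture areas, background."""
    layers = (("characters", characters),
              ("pictures", pictures),
              ("background", background))
    return [(name, to_bit_planes(vals, bits)) for name, vals in layers]


stream = build_stream([200, 5], [90, 95], [250, 250])
name, planes = stream[0]
coarse = from_bit_planes(planes[:4])   # only the four most significant planes
exact = from_bit_planes(planes)        # all eight planes
```

A decoder that stops after a few planes obtains a coarse approximation, and one that stops after the first layer obtains the text before the pictures and background arrive.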

In accordance with a second aspect of the present invention there is provided a
method of decoding a compressed document image stream, the method comprising
extracting and decoding one or more picture areas from the compressed document
image stream; extracting and decoding one or more character areas from the
compressed document image stream; extracting and decoding a background image from
the compressed data image stream; decoding the background image; and reconstructing
the decoded document image from the decoded picture areas, the decoded character
areas and the decoded background image.
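
The final reconstruction step of the second aspect can be pictured as compositing the decoded areas onto the decoded background. The sketch assumes each decoded area is a rectangle given as `(top, left, pixels)`, and that areas overwrite the background; both the representation and the overwrite rule are assumptions for illustration only.

```python
def paste(canvas, top, left, patch):
    """Copy a rectangular patch of pixels onto the canvas in place."""
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            canvas[top + dy][left + dx] = value


def reconstruct(background, character_areas, picture_areas):
    """Composite decoded character and picture areas onto the background.

    Each area is a hypothetical (top, left, pixels) tuple.
    """
    canvas = [row[:] for row in background]   # leave the input untouched
    for top, left, patch in character_areas:
        paste(canvas, top, left, patch)
    for top, left, patch in picture_areas:
        paste(canvas, top, left, patch)
    return canvas


background = [[255, 255, 255],
              [255, 255, 255]]
document = reconstruct(background,
                       character_areas=[(0, 1, [[0]])],
                       picture_areas=[(1, 0, [[90, 95]])])
```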

In accordance with a third aspect of the present invention there is provided a
computer readable data storage medium having stored thereon code means for
instructing a computer to execute a method of encoding a document image, the method
comprising extracting and encoding one or more picture areas from the document
image; extracting and encoding one or more character areas from the document image;
obtaining a background image by subtracting the image and character areas from the
document image; encoding the background image; and generating the encoded
document image from the encoded picture areas, the encoded character areas, and the
encoded background image.

In accordance with a fourth aspect of the present invention there is provided a
computer readable data storage medium having stored thereon code means for
instructing a computer to execute a method of decoding a compressed document image
stream, the method comprising extracting and decoding one or more picture areas from
the compressed document image stream; extracting and decoding one or more
character areas from the compressed document image stream; extracting and decoding
a background image from the compressed data image stream; decoding the background
image; and reconstructing the decoded document image from the decoded picture
areas, the decoded character areas and the decoded background image.

In accordance with a fifth aspect of the present invention there is provided a
system for encoding a document image, the system comprising means for extracting and
encoding one or more picture areas from the document image; means for extracting and
encoding one or more character areas from the document image; means for obtaining a
background image by subtracting the image and character areas from the document
image; means for encoding the background image; and means for generating the encoded
document image from the encoded picture areas, the encoded character areas, and the
encoded background image.

In accordance with a sixth aspect of the present invention there is provided a
system for decoding a compressed document image stream, the system comprising
means for extracting and decoding one or more picture areas from the compressed
document image stream; means for extracting and decoding one or more character
areas from the compressed document image stream; means for extracting and decoding
a background image from the compressed data image stream; means for decoding the
background image; and means for reconstructing the decoded document image from the
decoded picture areas, the decoded character areas and the decoded background
image.

BRIEF DESCRIPTION OF THE DRAWINGS 

Embodiments of the invention will be better understood and readily apparent 
to one of ordinary skill in the art from the following written description, by way of 
example only, and in conjunction with the drawings, in which: 
Figure 1 shows a block diagram illustrating an encoder process in an
example embodiment.

Figure 2 shows a block diagram illustrating a decoder process in an example
embodiment.

Figure 3 shows a block diagram illustrating an image block extractor process
in an example embodiment.

Figure 4 shows a block diagram illustrating a process for clustering of
character images in an example embodiment.

Figure 5 is a schematic drawing illustrating a computer system for
implementing the method and system of an example embodiment.

DETAILED DESCRIPTION

Embodiments of the present invention provide an image compression technique
for classifying, matching and identifying document images based on a wavelet
compression method. This method may be referred to as a wavelet document image
compression (WDIC) method. More specifically, in embodiments of the present
invention, the character and picture components may be separated from the 
CLAIMS

1. A method of encoding a document image, the method comprising:
extracting and encoding one or more pictures from the document image;
extracting one or more original characters from the document image;
encoding the original characters utilising a library of templates;
generating reconstructed characters from the encoded original characters;
obtaining a background image by subtracting the pictures and the reconstructed
characters from the document image;
encoding the background image; and
generating the encoded document image from the encoded pictures, the encoded
original characters, and the encoded background image.

2. The method as claimed in claim 1, wherein character blocks associated
with the original characters are classified with reference to dynamically generated
templates.

3. The method as claimed in claim 1 or 2, wherein the background image is
encoded utilising a SAQ wavelet encoder.

4. The method as claimed in claims 1 to 3, wherein the extracting of the
pictures and/or the characters comprises marking blocks partitioned from the document
image based on features of wavelet coefficients of the respective blocks.

5. The method as claimed in claims 1 to 4, wherein the extracting of the
pictures comprises a hierarchical extraction comprising extracting picture blocks from the
document image to generate one or more initial picture areas and refining the initial
picture areas by extracting picture pixels adjacent to the initial picture areas.

6. The method as claimed in any one of claims 1 to 5, wherein the extracting
of the characters from the document image comprises utilising a customised definition of
the connectivity of the pixels.
7. The method as claimed in claim 2, further comprising
generating style data as a description of the templates and character blocks.

8. The method as claimed in claim 7, wherein the classifying of the character
blocks comprises a hierarchical matching comprising matching the style of each
character block based on the style data and then matching each character block against
selected ones of the templates based on the style data matching.

9. The method as claimed in claim 2, wherein the classifying of
the character blocks based on the templates comprises morphological matching.

10. The method as claimed in claim 9, wherein the morphological matching
comprises matching algorithms M1 and M2.

11. The method as claimed in claim 10, wherein different structure elements
are utilised for different types of document images.

12. The method as claimed in any one of claims 1 to 11, further comprising
bit plane storage of a compressed stream of the document image in the order of
character areas, picture area and background image for progressive decoding.

13. A method of decoding a compressed document image stream, the
method comprising:
extracting and decoding one or more pictures from the compressed document
image stream;
extracting and decoding one or more reconstructed characters from the
compressed document image stream, wherein the reconstructed characters are
reconstructed from encoded original characters in the document image utilising a library
of templates;
extracting and decoding a background image from the compressed data image
stream, wherein the background image includes a subtraction of the pictures and the
reconstructed characters from the document image; and
reconstructing the decoded document image from the decoded pictures, the
decoded reconstructed characters and the decoded background image.
14. A computer readable data storage medium having stored thereon code
means for instructing a computer to execute a method of encoding a document image,
the method comprising:
extracting and encoding one or more pictures from the document image;
extracting one or more original characters from the document image;
encoding the original characters utilising a library of templates;
generating reconstructed characters from the encoded original characters;
obtaining a background image by subtracting the pictures and the reconstructed
characters from the document image;
encoding the background image; and
generating the encoded document image from the encoded pictures, the encoded
original characters, and the encoded background image.

15. A computer readable data storage medium having stored thereon code
means for instructing a computer to execute a method of decoding a compressed
document image stream, the method comprising:
extracting and decoding one or more pictures from the compressed document
image stream;
extracting and decoding one or more encoded reconstructed characters from the
compressed document image stream, wherein the reconstructed characters are
reconstructed from encoded original characters in the document image utilising a library
of templates;
extracting and decoding a background image from the compressed data image
stream, wherein the background image includes a subtraction of the pictures and the
reconstructed characters from the document image; and
reconstructing the decoded document image from the decoded pictures, the
decoded reconstructed characters and the decoded background image.

16. A system for encoding a document image, the system comprising:
means for extracting and encoding one or more pictures from the document
image;
means for extracting one or more original characters from the document
image;
means for encoding the original characters utilising a library of templates;
means for generating reconstructed characters from the encoded original
characters;
means for obtaining a background image by subtracting the pictures and the
reconstructed characters from the document image, whereby the background image
comprises residual image data representing differences between the original characters
and the templates;
means for encoding the background image; and
means for generating the encoded document image from the encoded pictures, the
encoded original characters, and the encoded background image.

17. A system for decoding a compressed document image stream, the
system comprising:
means for extracting and decoding one or more pictures from the compressed
document image stream;
means for extracting and decoding one or more reconstructed characters from
the compressed document image stream, wherein the reconstructed characters are
reconstructed from encoded original characters in the document image utilising a library
of templates;
means for extracting and decoding a background image from the compressed
data image stream, wherein the background image includes a subtraction of the pictures
and the reconstructed characters from the document image; and
means for reconstructing the decoded document image from the decoded
pictures, the decoded reconstructed characters and the decoded background image.